Wrapper feature selection with partially labeled data
Abstract
In this paper, we propose a new feature selection approach with partially labeled training examples in the multi-class classification setting. It is based on a modification of the genetic algorithm that creates and evaluates candidate feature subsets during an evolutionary process, taking feature weights into account and recursively eliminating irrelevant features. To increase the variety of the data, unlabeled observations are also employed, namely by pseudo-labeling them using self-learning with a recently proposed transductive policy. Empirical results on different data sets show the effectiveness of our method compared to several state-of-the-art semi-supervised approaches.
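The general idea described in the abstract can be illustrated with a minimal sketch. It is not the authors' method: it uses a plain genetic algorithm (truncation selection, one-point crossover, bit-flip mutation) without the feature weighting or recursive elimination the paper mentions, a fixed margin threshold in place of the transductive policy, and a nearest-centroid classifier as a stand-in learner. All function names, parameters, and the toy data are assumptions made for illustration.

```python
import random

def centroids(X, y, mask):
    """Per-class mean over the features selected by the 0/1 mask."""
    idx = [j for j, m in enumerate(mask) if m]
    sums, counts = {}, {}
    for row, label in zip(X, y):
        acc = sums.setdefault(label, [0.0] * len(idx))
        for k, j in enumerate(idx):
            acc[k] += row[j]
        counts[label] = counts.get(label, 0) + 1
    return idx, {c: [v / counts[c] for v in s] for c, s in sums.items()}

def predict(model, row):
    """Return (label, margin); margin is the gap between the two closest centroids."""
    idx, cents = model
    dists = sorted((sum((row[j] - c[k]) ** 2 for k, j in enumerate(idx)), label)
                   for label, c in cents.items())
    margin = dists[1][0] - dists[0][0] if len(dists) > 1 else 0.0
    return dists[0][1], margin

def pseudo_label(X_lab, y_lab, X_unl, mask, threshold):
    """One self-learning pass: keep unlabeled points whose margin exceeds
    a fixed threshold (an assumption, not the paper's policy)."""
    model = centroids(X_lab, y_lab, mask)
    X_aug, y_aug = list(X_lab), list(y_lab)
    for row in X_unl:
        label, margin = predict(model, row)
        if margin > threshold:
            X_aug.append(row)
            y_aug.append(label)
    return X_aug, y_aug

def fitness(mask, X_lab, y_lab, X_unl, X_val, y_val, threshold):
    """Wrapper fitness: validation accuracy of a model trained on labeled
    plus pseudo-labeled data, minus a small penalty per selected feature."""
    if not any(mask):
        return 0.0
    X_aug, y_aug = pseudo_label(X_lab, y_lab, X_unl, mask, threshold)
    model = centroids(X_aug, y_aug, mask)
    acc = sum(predict(model, r)[0] == t for r, t in zip(X_val, y_val)) / len(y_val)
    return acc - 0.01 * sum(mask)

def ga_select(X_lab, y_lab, X_unl, X_val, y_val, n_feat,
              pop_size=20, generations=15, p_mut=0.1, threshold=1.0, seed=0):
    """Plain GA over feature bitmasks; returns the best mask found."""
    rng = random.Random(seed)
    score = lambda m: fitness(m, X_lab, y_lab, X_unl, X_val, y_val, threshold)
    pop = [[rng.randint(0, 1) for _ in range(n_feat)] for _ in range(pop_size)]
    for _ in range(generations):
        pop.sort(key=score, reverse=True)
        parents = pop[: pop_size // 2]          # truncation selection
        children = []
        while len(parents) + len(children) < pop_size:
            a, b = rng.sample(parents, 2)
            cut = rng.randrange(1, n_feat)      # one-point crossover
            child = a[:cut] + b[cut:]
            child = [1 - g if rng.random() < p_mut else g for g in child]
            children.append(child)
        pop = parents + children
    return max(pop, key=score)

def make_data(n, rng):
    """Toy 3-class data: features 0 and 1 are informative, 2 to 5 are noise."""
    X, y = [], []
    for i in range(n):
        label = i % 3
        row = [label * 3.0 + rng.gauss(0, 0.5), -label * 3.0 + rng.gauss(0, 0.5)]
        row += [rng.gauss(0, 3.0) for _ in range(4)]
        X.append(row)
        y.append(label)
    return X, y

rng = random.Random(42)
X_lab, y_lab = make_data(15, rng)     # small labeled set
X_unl, _ = make_data(60, rng)         # labels discarded: treated as unlabeled
X_val, y_val = make_data(30, rng)     # held-out labeled set used for fitness
best = ga_select(X_lab, y_lab, X_unl, X_val, y_val, n_feat=6)
print("selected mask:", best)
```

In this sketch the unlabeled data influence selection only through the pseudo-labeled points added before each candidate subset is scored; a fuller treatment would iterate the self-learning step and choose the confidence threshold in a principled way.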
Similar resources
A Wrapper Feature Selection Approach to Classification with Missing Data
Many industrial and real-world datasets suffer from an unavoidable problem of missing values. The problem of missing data has been addressed extensively in the statistical analysis literature and also, to a lesser extent, in the classification literature. The ability to deal with missing data is an essential requirement for classification because inadequate treatment of missing data may lea...
Wrapper Feature Selection
INTRODUCTION: It is well known that the performance of most data mining algorithms can be degraded by features that do not add any value to learning tasks. Feature selection can be used to limit the effects of such features by seeking only the relevant subset from the original features (de Souza et al., 2006). This subset of the relevant features is discovered by removing those that are cons...
Wrapper for Ranking Feature Selection
We propose a new feature selection criterion not based on calculated measures between attributes, or complex and costly distance calculations. Applying a wrapper to the output of a new attribute ranking method, we obtain a minimum subset with the same error rate as the original data. The experiments were compared to two other algorithms with the same results, but with a very short computation t...
Optimizing Wrapper-Based Feature Selection for Use on Bioinformatics Data
High dimensionality (having a large number of independent attributes) is a major problem for bioinformatics datasets such as gene microarray datasets. Feature selection algorithms are necessary to remove the irrelevant (not useful) and redundant (containing duplicate information) features. One approach to handling this problem is wrapper-based subset evaluation, which builds classification models on...
Parallel GA-Based Wrapper Feature Selection for Spectroscopic Data Mining
Mining predictive models in dense databases is CPU-time-consuming and I/O-intensive. In this paper, we propose a taxonomy of existing techniques for achieving high performance. We propose a hybrid approach that exploits four of them: feature selection, GA-based exploration space reduction, parallelism and concurrency. The approach is evaluated on a near-infrared spectroscopic...
Journal
Journal title: Applied Intelligence
Year: 2022
ISSN: 0924-669X, 1573-7497
DOI: https://doi.org/10.1007/s10489-021-03076-w